Preventing Catastrophic Interference in Multiple-Sequence Learning Using Coupled Reverberating Elman Networks
Authors
Abstract
Everyone agrees that real cognition requires much more than static pattern recognition. In particular, it requires the ability to learn sequences of patterns (or actions). But learning sequences really means being able to learn multiple sequences, one after the other, without the most recently learned ones erasing the previously learned ones. And if catastrophic interference is a problem for the sequential learning of individual patterns, the problem is amplified many times over when multiple sequences of patterns have to be learned consecutively, because each new sequence consists of many linked patterns. In this paper we present a connectionist architecture that appears to solve the problem of multiple-sequence learning using pseudopatterns.

Introduction

Building a robot that could unfailingly recognize and respond to hundreds of objects in the world – apples, mice, telephones and paper napkins among them – would unquestionably constitute a major artificial-intelligence tour de force. But everyone agrees that real cognition requires much more than static pattern recognition. In particular, it requires the ability to learn sequences of patterns (or actions). This was the primary reason for the development of the simple recurrent network (SRN; Elman, 1990) and the many variants of this architecture. But learning sequences means more than being able to learn a single, isolated sequence of patterns: it means being able to learn multiple sequences, one after the other, without the most recently learned ones erasing the previously learned ones. And if catastrophic interference – the phenomenon whereby new learning completely erases old learning – is a problem for static pattern learning (McCloskey & Cohen, 1989; Ratcliff, 1990), it is amplified many times over when multiple sequences of patterns have to be learned consecutively, because each sequence consists of many new linked patterns.
What hope is there for a previously learned sequence of patterns to survive after the network has learned a new sequence consisting of many individual patterns? In this paper, we present a connectionist architecture that solves the problem of multiple-sequence learning.

Catastrophic interference

The problem of catastrophic interference (or forgetting) has been with the connectionist community for well over a decade now (McCloskey & Cohen, 1989; Ratcliff, 1990; for a review see Sharkey & Sharkey, 1995). Catastrophic forgetting occurs when newly learned information suddenly and completely erases information that was previously learned by the network, a phenomenon that is not only implausible cognitively but disastrous for most practical applications. The problem has been studied by numerous authors over the past decade (see French, 1999, for a review). The difficulty is that the very property – a single set of weights used to encode all information – that gives connectionist networks their remarkable abilities of generalization and graceful degradation in the presence of incomplete information is also the root cause of catastrophic interference (see, for example, French, 1992). Various authors (Ans & Rousset, 1997, 2000; French, 1997; Robins, 1995) have developed systems that rehearse on pseudo-episodes (or pseudopatterns) rather than on the real items that were previously learned. The basic principle of this mechanism is, when learning new external patterns, to interleave them with internally generated pseudopatterns. These latter patterns, generated by the network itself from random activation, reflect (but are not identical to) the previously learned information. It has now been established that this pseudopattern rehearsal method effectively eliminates catastrophic forgetting. A serious problem remains, however: cognition involves more than being able to sequentially learn a series of "static" (non-temporal) patterns without interference.
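The pseudopattern rehearsal mechanism described above can be sketched in a few lines. The following is only an illustrative toy, not the authors' implementation: it replaces the backpropagation networks of the literature with a single-layer linear network trained by the delta rule, and uses two deliberately conflicting made-up tasks so that the effect of interleaving pseudopatterns (random inputs paired with the old network's responses to them) is visible.

```python
import numpy as np

rng = np.random.default_rng(0)

def train(W, inputs, targets, lr=0.1, epochs=300):
    # Delta-rule training of a single-layer linear network -- a minimal
    # stand-in for the backprop networks used in the pseudopattern work.
    for _ in range(epochs):
        for x, t in zip(inputs, targets):
            W += lr * np.outer(t - W @ x, x)
    return W

def make_pseudopatterns(W, n, dim):
    # A pseudopattern pairs a random input with the *old* network's
    # response to it: it approximates the stored knowledge without any
    # access to the original training items.
    xs = rng.uniform(-1.0, 1.0, size=(n, dim))
    return xs, xs @ W.T

dim = 4
A_in = rng.uniform(-1.0, 1.0, (3, dim))
A_out = A_in.copy()        # task A (illustrative): reproduce the input
B_in = rng.uniform(-1.0, 1.0, (3, dim))
B_out = -B_in              # task B (illustrative): a conflicting mapping

def error(W, xs, ts):
    return float(np.mean(np.linalg.norm(xs @ W.T - ts, axis=1)))

# Learn task A, then learn task B alone: task A is largely overwritten.
W_A = train(np.zeros((dim, dim)), A_in, A_out)
W_naive = train(W_A.copy(), B_in, B_out)

# Learn task B interleaved with pseudopatterns drawn from the old network.
ps_in, ps_out = make_pseudopatterns(W_A, 30, dim)
W_rehearsed = train(W_A.copy(),
                    np.vstack([B_in, ps_in]),
                    np.vstack([B_out, ps_out]))

print("error on task A, no rehearsal:  ", error(W_naive, A_in, A_out))
print("error on task A, with rehearsal:", error(W_rehearsed, A_in, A_out))
```

Because the thirty pseudopatterns outnumber the three conflicting items of task B, the interleaved network settles on a compromise that stays much closer to the old mapping, which is the essence of the rehearsal idea.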
It is of equal importance to be able to serially learn many temporal sequences of patterns. We propose a pseudopattern-based architecture that can effectively learn multiple temporal sequences consecutively. The key insight of this paper is this: once an SRN has learned a particular sequence, each pseudopattern generated by that network reflects the entire sequence (or set of sequences) that has been learned.
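The generation step behind this key insight can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the weights here are random (a trained SRN would supply them), and the class and function names are our own. The point is the closed loop itself: reset the context, inject a random seed input, and feed each output back in as the next input, so that the resulting pseudo-sequence is shaped by whatever sequential structure the weights encode.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class ElmanNet:
    # Minimal Elman (simple recurrent) network: the hidden layer sees the
    # current input plus a copy of its own previous state (the context).
    def __init__(self, n_units, n_hidden):
        self.W_in = rng.normal(0.0, 1.0, (n_hidden, n_units))
        self.W_ctx = rng.normal(0.0, 1.0, (n_hidden, n_hidden))
        self.W_out = rng.normal(0.0, 1.0, (n_units, n_hidden))
        self.context = np.zeros(n_hidden)

    def step(self, x):
        h = sigmoid(self.W_in @ x + self.W_ctx @ self.context)
        self.context = h          # hidden state becomes the next context
        return sigmoid(self.W_out @ h)

def pseudo_sequence(net, first_input, length):
    # Generate a pseudo-sequence: reset the context, then repeatedly feed
    # the network's output back in as its next input. With weights learned
    # from prior sequences (random here, for illustration only), the
    # generated sequence reflects everything the network has stored.
    net.context = np.zeros(net.context.shape)
    x = first_input
    seq = []
    for _ in range(length):
        y = net.step(x)
        seq.append((x, y))
        x = y                     # output becomes the next input
    return seq

net = ElmanNet(n_units=5, n_hidden=8)
seed_input = rng.uniform(0.0, 1.0, 5)
seq = pseudo_sequence(net, seed_input, length=6)
```

Pseudo-sequences generated this way can then be interleaved, item by item, with the training items of a new sequence, exactly as static pseudopatterns are interleaved with new static patterns.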
Similar references
Artificial neural networks whispering to the brain: nonlinear system attractors induce familiarity with never seen items
Attractors of nonlinear neural systems are at the core of the memory self-refreshing mechanism of human memory models that suppose memories are dynamically maintained in a distributed network [Ans, B., and Rousset, S. (1997), ‘Avoiding Catastrophic Forgetting by Coupling Two Reverberating Neural Networks’ Comptes Rendus de l’Académie des Sciences Paris, Life Sciences, 320, 989–997; Ans, B., and...
Sequential Learning in Distributed Neural Networks without Catastrophic Forgetting: A Single and Realistic Self-Refreshing Memory Can Do It
In sequential learning tasks, artificial distributed neural networks forget catastrophically; that is, newly learned information most often erases previously learned information. This major weakness is not only cognitively implausible, as humans forget gradually, but disastrous for most practical applications. An efficient solution to catastrophic forgetting has been recently proposed for backpropaga...
Neural networks with a self-refreshing memory: knowledge transfer in sequential learning tasks without catastrophic forgetting
We explore a dual-network architecture with self-refreshing memory (Ans and Rousset 1997) which overcomes catastrophic forgetting in sequential learning tasks. Its principle is that new knowledge is learned along with an internally generated activity reflecting the network history. What mainly distinguishes this model from others using pseudorehearsal in feedforward multilayer networks is a rev...
Meaningful Representations Prevent Catastrophic Interference
Artificial Neural Networks (ANNs) attempt to mimic human neural networks in order to perform tasks. In order to do this, tasks need to be represented in ways that the network understands. In ANNs these representations are often arbitrary, whereas in humans it seems that these representations are often meaningful. This article shows how using more meaningful representations in ANNs can be very b...
Methods for integrating memory into neural networks applied to condition monitoring
A criticism of neural network architectures is their susceptibility to "catastrophic interference": the tendency to forget previously learned data when presented with new patterns. To avoid this, neural network architectures have been developed which specifically provide the network with a memory, either through the use of a context unit, which can store patterns for later recall, or which combin...